-
Notifications
You must be signed in to change notification settings - Fork 1.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow duplicated model names across packages #3053
Conversation
@jtcohen6 what say ye - is this a change we're interested in making in dbt? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm on board. The constructs hold together better than I expected, to be honest. I do think we need an answer for resource properties (description
, tests
, etc), which may be straightforward or may be nefariously complex.
Blocking
Can we enable namespacing for resource properties? In the event that there are multiple, could we safely assume that the user is trying to describe properties for the resource in the current package? It would be a breaking change to prevent users from defining properties for resources in other packages, but this is only possible for package resources that don't already have properties themselves, anyway.
Compilation Error
dbt found two schema.yml entries for the same resource named model_a. Resources and their associated columns may only be described a single time. To fix this, remove the resource entry for model_a in one of these files:
- models/other_schema.yml
models/schema.yml
At the very least, we should update that error message to include which package the .yml
files are in.
Not blocking
$ dbt run -m model_a
includes both models! I guess that makes sense, it just feels a little funny to me. Obviously, $ dbt run -m root_project.model_a
selects just the one.
The docs site actually works pretty well, it's just the same thing as above: focusing on +model_a+
in the DAG viz shows the subgraphs of both models. I think the fix here would be pretty simple: We should fully qualify resource names (including package name) in the DAG side-view launcher, and as the result of "Refocus on node."
Should be:
Of course, that still elides the UX issues of displaying two nodes with the same name, and no visual components to distinguish them. I don't find myself all that troubled, though: in part, because the package dropdown selector is first class in the DAG viz; in part, because I still wouldn't recommend doing this, exactly because it risks ambiguity. Sure, there are things we could do to clear up that ambiguity—or better enable disambiguation, at least—and they're things we're already thinking about (dbt-labs/dbt-docs#122).
# have unique names within a package | ||
continue | ||
|
||
if len(matched) > 1: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do wonder if, in the event that there are same-named models in the current project and in an installed package, we should still support single-argument ref
and default to the resource in the same namespace. We would only raise this error if all the resources by that name are in other namespaces.
My motivation here: It's possible that a user wants to install a package that has a same-named model as one in their project, and the models in the package only uses single-argument ref
. Of course, I think there's a lot of benefit in being more explicit to avoid surprises, so alternatively we'd establish a convention that package authors ought to use two-argument ref
everywhere. It wouldn't be a breaking change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this idea!
Hi @drewbanin, |
This remains a good idea. The blocking issue also remains: it's not possible to define resource properties for two resources with the same name, even if they live in different packages. I'd have appetite to take another swing at this after v1.0—so, likely next year. This will serve as a solid jumping-off point when we're ready to take that swing. |
resolves #1269 (partially?)
Description
This PR makes it possible to have refable nodes (models, seeds, and snapshots) with duplicated names across different packages. Refable nodes with duplicated names continue to be disallowed within a specific package.
TODO:
Checklist
CHANGELOG.md
and added information about my change to the "dbt next" section.